 data quality management


A Theoretical Framework for AI-driven data quality monitoring in high-volume data environments

Bangad, Nikhil, Jayaram, Vivekananda, Krishnappa, Manjunatha Sughaturu, Banarse, Amey Ram, Bidkar, Darshan Mohan, Nagpal, Akshay, Parlapalli, Vidyasagar

arXiv.org Artificial Intelligence

This paper presents a theoretical framework for an AI-driven data quality monitoring system designed to address the challenges of maintaining data quality in high-volume environments. We examine the limitations of traditional methods in managing the scale, velocity, and variety of big data and propose a conceptual approach leveraging advanced machine learning techniques. Our framework outlines a system architecture that incorporates anomaly detection, classification, and predictive analytics for real-time, scalable data quality management. Key components include an intelligent data ingestion layer, adaptive preprocessing mechanisms, context-aware feature extraction, and AI-based quality assessment modules. A continuous learning paradigm is central to our framework, ensuring adaptability to evolving data patterns and quality requirements. We also address implications for scalability, privacy, and integration within existing data ecosystems. While practical results are not provided, the paper lays a robust theoretical foundation for future research and implementations, advancing data quality management and encouraging the exploration of AI-driven solutions in dynamic environments.
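The abstract gives no implementation, so the following is only a minimal sketch of one idea it names: anomaly detection that keeps adapting as data arrives. It assumes a single numeric metric (for example, a per-batch row count) and uses a running z-score built on Welford's online algorithm. The names `RunningStats` and `flag_anomalies`, the threshold, and the warm-up length are hypothetical choices, not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Online mean and variance (Welford's algorithm); hypothetical helper."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation; undefined below two observations.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


def flag_anomalies(values, threshold=3.0, warmup=5):
    """Return indices whose z-score against the running stats exceeds threshold.

    Assumption: flagged values are not folded into the statistics, so a
    single outlier does not shift the baseline.
    """
    stats = RunningStats()
    flagged = []
    for index, value in enumerate(values):
        if stats.count >= warmup and stats.std > 0:
            z_score = abs(value - stats.mean) / stats.std
            if z_score > threshold:
                flagged.append(index)
                continue
        stats.update(value)
    return flagged


if __name__ == "__main__":
    metric = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 50.0, 10.1]
    print(flag_anomalies(metric))
```

Because the statistics are updated incrementally, the baseline follows gradual drift in the metric without storing the full history, which is one simple reading of the continuous learning idea described above.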


Towards augmented data quality management: Automation of Data Quality Rule Definition in Data Warehouses

Tamm, Heidi Carolina, Nikiforova, Anastasija

arXiv.org Artificial Intelligence

In the contemporary data-driven landscape, ensuring data quality (DQ) is crucial for deriving actionable insights from vast data repositories. The objective of this study is to explore the potential for automating data quality management within data warehouses, a type of data repository commonly used by large organizations. By conducting a systematic review of existing DQ tools available in the market and academic literature, the study assesses their capability to automatically detect and enforce data quality rules. The review encompassed 151 tools from various sources, revealing that most current tools focus on data cleansing and fixing in domain-specific databases rather than data warehouses. Only ten tools demonstrated the capability to detect DQ rules, and fewer still could do so in data warehouses. The findings underscore a significant gap in the market and academic research regarding AI-augmented DQ rule detection in data warehouses. This paper advocates for further development in this area to enhance the efficiency of DQ management processes, reduce human workload, and lower costs. The study highlights the necessity of advanced tools for automated DQ rule detection, paving the way for improved practices in data quality management tailored to data warehouse environments. The study can guide organizations in selecting the data quality tool that best meets their requirements.
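To make "DQ rule detection" concrete, here is a minimal sketch that proposes candidate rules for one column from sample rows. It is not the method of any tool surveyed in the study. The function name `infer_rules`, the rule labels, and the row format (a list of dictionaries) are all assumptions made for illustration.

```python
def infer_rules(rows, column):
    """Propose candidate data quality rules for one column.

    Hypothetical sketch: rules are inferred from the sample alone, so each
    one is a suggestion for a human to confirm, not a verified constraint.
    """
    values = [row.get(column) for row in rows]
    present = [value for value in values if value is not None]
    rules = []

    # Candidate NOT NULL rule: no missing values in the sample.
    if values and len(present) == len(values):
        rules.append("not_null")

    # Candidate uniqueness rule: no repeated non-missing values.
    if present and len(set(present)) == len(present):
        rules.append("unique")

    # Candidate range rule for numeric columns (bool is excluded on purpose).
    if present and all(
        isinstance(value, (int, float)) and not isinstance(value, bool)
        for value in present
    ):
        rules.append(f"range[{min(present)}, {max(present)}]")

    return rules


if __name__ == "__main__":
    sample = [
        {"id": 1, "age": 30},
        {"id": 2, "age": None},
        {"id": 3, "age": 45},
    ]
    print(infer_rules(sample, "id"))
    print(infer_rules(sample, "age"))
```

A sample-based approach like this can only suggest rules that the observed data happens to satisfy, which is why the sketch returns candidates rather than enforcing anything.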


Data quality trends to watch

#artificialintelligence

Data quality management efforts -- tied to disrupting innovations, rapid market shifts and regulation pressures -- will continue to grow in 2023 and take on a more dominant role in the data management ecosystem. Turning to the cloud, edge, 5G and machine learning, hybrid worldwide workforces and global customers are generating data at levels never experienced before. The success of data quality management depends on deployment, infrastructure and modernization strategies. The 2022 State of Data Quality report from Ataccama reveals that automation and modernization efforts have still not been universally adopted. While seven in ten enterprises surveyed (69%) have begun their DQM journeys, they still have not achieved high maturity levels.


How Is Data Quality Management Being Transformed by AI and ML?

#artificialintelligence

Technology has risen to prominence in recent years, both at work and at home. The fields of artificial intelligence (AI) and machine learning (ML) are advancing at a rapid pace right now. Almost everyone's everyday life will be impacted by AI in some way. Siri, Google Maps, Netflix, and social media (Facebook/Snapchat) are just a few examples. Artificial Intelligence and Machine Learning (ML) are two buzzwords that are frequently used interchangeably.


How AI & ML transforming data quality management - DataScienceCentral

#artificialintelligence

In recent years technology has become prominent, both at work and at home. Machine learning (ML) and Artificial Intelligence (AI) are evolving quickly today. Almost everyone will have some interaction with a form of AI daily. Some common examples include Siri, Google Maps, Netflix, and social media (Facebook/Snapchat). AI and ML are popular buzzwords right now and are often used interchangeably. Most experimentation has been geared to finding specific solutions to specific problems.


Top 10 Analytics & Business Intelligence Trends for 2019 (Infographic) – IntellectFaces

#artificialintelligence

The datapine report authored by Sandra Durcevic states that 2019 is the year of data discovery and data quality management. According to the report, Business Intelligence strategies will become more customized, which points to a good success rate for companies adopting Business Intelligence and analytics. Artificial Intelligence: AI emphasizes the creation of intelligent machines that work and react like humans. Data discovery: For many companies, data discovery has had a massive impact in recent years on the way they use manpower with curated data. According to BI practitioners, this empowerment of business users is a current trend.